An Efficient Algorithm for approximating ID Ground States 
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The density-matrix renormalization-group method is very effective at finding ground states of 
one-dimensional (ID) quantum systems in practice, but it is a heuristic method, and there is no 
known proof for when it works. In this article we describe an efficient classical algorithm which 
provably finds a good approximation of the ground state of ID systems under well defined conditions. 
More precisely, our algorithm finds a matrix product state of bond dimension D whose energy 
approximates the minimal energy such states can achieve. The running time is exponential in D, 
and so the algorithm can be considered tractable even for D which is logarithmic in the size of the 
chain. The result also implies trivially that the ground state of any local commuting Hamiltonian 
in ID can be approximated efficiently; we improve this to an exact algorithm. 



I. INTRODUCTION 

Finding ground states of local one-dimensional (ID) 
Hamiltonian systems is a major problem in physics. 
The most commonly used method is the density- 
matrix renormalization-group (DMRG) [lrH, discovered 
in 1992. DMRG can be cast in the form of matrix prod- 
uct states (MPSs) which are succinct representations of 
ID quantum states using D x D matrices, where the co- 
efficients in the state can be written in terms of products 
of these matrices. The number of matrices is dn, where d 
is the dimension of each individual particle and n is the 
number of particles in the system. The parameter D is 
called the bond dimension. DMRG works essentially as 
follows: The algorithm starts with some initial MPS and 
sweeps from one end of the chain to the other, optimizing 
the entries of the matrices at one site with the other pa- 
rameters fixed. Some versions allow optimizing over two 
neighboring sites at once, which enables the algorithm 
to increase the bond dimension in the course of the algo- 
rithm for improved accuracy. In all cases, the approach is 
to apply local optimizations iteratively. It is thus easy to 
construct examples in which the DMRG algorithm gets 
trapped in a local minimum. To illustrate this, think of 
a ID spin chain whose Hamiltonian consists of two types 
of interactions: One type consists of interactions which 
force the spins to be aligned; every two neighboring sites 
gain an energy penalty of say 4 if they are not aligned. 
The other type of term gives every spin an energy penalty 
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of 1 if it points upward. Starting from the all-up string, a 
local move only increases the energy; thus, local update 
rules cannot take the system to its ground state, the all- 
down string. This example can of course be handled by 
randomizing the initial string, for example, or increasing 
the window size; however, it demonstrates that DMRG 
has a fundamental difficulty in addressing non local char- 
acteristics of the system. It is natural to ask if there is a 
general algorithm that does not get stuck in local minima 
as DMRG does and provably always find a good approx- 
imation of the ground state of a given ID system in a 
reasonable amount of time. 

To answer this question, we first ask what is known 
regarding the analogous question in the easier, classical, 
case. It was Kitaev Q who drew the important connec- 
tion between the problem of finding ground energy and 
ground states of local Hamiltonians, and the well-known 
classical constraint satisfaction problem (CSP). The in- 
put to a CSP consists of constraints {H c } c on n g-state 
classical particles. Each H c acts on k particles (for some 
constant A;) and is given as a Boolean function on the 
possible assignments to those k particles; when H c — 1 
the configuration is forbidden and when H c = it is al- 
lowed. The problem is to determine the maximum num- 
ber of constraints that can be satisfied, or alternatively, 
to minimize ^2 C H C . The decision version of this prob- 
lem is to determine whether it is possible to satisfy more 
than some given number of constraints. This is one of 
the most well-known NP-complete problems. CSP can 
clearly be seen as a special case of the problem of finding 
ground states and ground energies of local Hamiltonians, 
in which the terms in the Hamiltonian are projections on 
local forbidden configurations. This analogy has led over 
the past few years to many interesting insights regarding 



2 



the local Hamiltonian problem (see, e.g., Ref. @~Q3)- 

Let us therefore see what the known classical results 
regarding CSP in ID can teach us about ID local Hamil- 
tonians and their ground states. We recall that in the 
classical case, ID CSPs (in which the particles are ar- 
ranged in a line and constraints are between k adja- 
cent neighbors) are dramatically easier than their higher- 
dimensional counterparts. While even the 2D case is NP 
complete, the ID problem can be solved in polynomial 
time. The reason for the tractability of the problem in 
ID is essentially that the problem can be divided into 
sub problems, namely, the left- and the right-hand sides 
of the chain, which interact only via the k particles on 
the border. The fact that these particles can only be 
assigned a small number of possible values makes it pos- 
sible to handle the problem by solving each sub problem 
separately for each fixed possible assignment to the bor- 
der particles and then gluing the sub solutions together 
by picking the best choice for the middle particles. We 
explain the algorithm in detail later; the outcome is an 
algorithm which is linear in the number of particles in the 
chain and quadratic in the number of states per particle. 

Unfortunately, there is no hope of getting such a gen- 
eral result for the ID quantum problem. Aharonov et al 
[loj have shown that approximating the ground energy 
for general ID quantum systems is as hard as quantum- 
NP. Even when restricted to ground states that are well- 
approximated by MPSs of polynomial bond dimension, 
the problem is NP-hard, as was shown by Schuch et al 
[l3j. A related earlier result due to Eisert [l4| showed 
that optimizing a constant number of matrices in the 
MPS representation subject to fixed values in the other 
matrices is NP-hard. These results indicate that the di- 
chotomy between the computational difficulty of ID and 
2D classical systems does not carry over to the quantum 
setting, and it is highly unlikely that the quantum ID 
problem is tractable. Nevertheless, we show here that 
using the classical ID algorithm as a template for an al- 
gorithm for the quantum problem leads to a solution for a 
wide and interesting class of local Hamiltonian problems, 
namely, for those cases in which we can assume that the 
bond dimension is small. 



A. Main Result 

We derive an efficient algorithm for approximating the 
minimal energy of a ID system among all states of a 
bounded bond dimension D. The algorithm is exponen- 
tial in D and thus can be considered reasonable, though 
maybe not practical, even for D, which is logarithmic 
in the size of the chain. The algorithm also provides 
a description of an MPS with the approximate minimal 
energy. 

Theorem 1 Let H be a nearest-neighbor Hamiltonian on 
a ID system of n d- dimensional particles. Let J be a 
bound on the operator norm of each local term. There is 



an algorithm that takes as input e, H and D and produces 
an MPS |0) of bond dimension D, such that for any MPS 
of bond dimension D with nD 2 > 12, 

(n\H\n) < (ip\H\ip) + 2JD 2 n 2 e . (1) 

The algorithm runs in time n ■ poly(d, -D, N), where N = 
q n44,dD \ D + 2dD 

Several remarks are in place here. First, note that the 
restriction that the interactions are nearest neighbor is 
done without loss of generality since any ID system can 
be reduced to a 2-local ID system with nearest neighbor 
interactions by grouping neighboring particles together. 

Note also that the running time in the above theorem 
is phrased as a linear function in n, the size of the system, 
times some fixed amount of time spent per particle. The 
error, however, scales with n 2 . One may want to apply 
the theorem to derive an approximation with a fixed ad- 
ditive error 5, in which case simply set e = Sn~ 2 in the 
above theorem to get the running time as a function of 
S. 

This result shows that the problem of finding bounded 
bond dimension MPSs can be done in polynomial time. 
Unfortunately, the running time, though efficient in the- 
ory, is quite impractical, as even for D = 2 and the error 
e/n 2 a constant, we get a running time which scales like 
n le . It is hard to imagine that these running times are 
practical. Nevertheless, it is very likely that the run- 
ning time can be improved; in particular, when solving 
specific problems with certain symmetries, dramatic im- 
provements may be possible. Moreover, it is possible 
that this algorithm can be used to boost DMRG in cer- 
tain cases where it gets stuck or to create the initial state 
of DMRG. All these improvements are left for further re- 
search. 

We now provide an overview of the algorithm. To un- 
derstand the general idea, we first recall how the classical 
ID algorithm works in detail. Consider the case of the 
classical CSP on a line with k = 2, namely the prob- 
lem of minimizing the energy function H — X)"=i 
An optimal assignment can be found efficiently by a stan- 
dard algorithmic technique called dynamic programming. 
Define the partial problem up to the (r + l)th particle, 
H r = X)I=i The algorithm starts with the par- 

tial problem defined for r = 1 and creates a list Li of 
possible assignments to the first two particles as follows: 
For each of the q possible assignments a 2 to particle 2, 
the algorithm finds an assignment o\ to particle 1 which 
minimizes H\{o\,o-i). That optimal o~\ is called the tail 
of (72- For each a 2 the algorithm keeps its tail o~\ and 
also the energy of this partial assignment, i?i(<7i, o~2). L 2 
thus contains the best possible partial assignment with 
each possible ending. After r — 1 iterations, we assume 
the algorithm has a list L r consisting of an optimal tail 
<7i, ...,o>_i for each of the q possible assignments a r to 
the rth particle, where optimality is measured with re- 
spect to H r -\. In other words, the algorithm has a solu- 
tion to the subproblem confined to the first r particles, 
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with any possible ending. To include the next particle, 
and create the next list £ r+1 , the algorithm finds the op- 
timal tail of each assignment oy+i. This is done by con- 
sidering all items in the list L r as possible tails for ay+i 
and taking the tail which minimizes H r (ai, cr r +i). In 
each of the n— 1 iterations, the algorithm checks for each 
of the q possible assignments oy, all q items in the list 
L, _i. Thus, in time which is linear in n and quadratic 
in q, we can derive the final list L n _i. The final solution 
is an assignment of minimal energy in that list. 

The main idea in this article is to generalize the above 
algorithm to MPSs by replacing assignments to parti- 
cles by possible values of MPS matrices. Since matrices 
are continuous objects, we use an e-net over all possible 
matrices of bond dimension D. The number of possible 
assignments to one variable, q, will now be replaced by 
the number of points in the e-net, denoted as N. We 
will move from one site to the next, keeping track of the 
minimum-energy MPS state, which ends in each MPS 
matrix for the right most particle that the algorithm has 
reached. 

In order to carry out this idea, it must not happen 
that the choice of the MPS matrix of a later iteration can 
change the optimality of the partial MPS state found in 
an earlier iteration. To avoid this, we work with a re- 
stricted form of MPSs called canonical MPSs, in which 
the energy of each term in the Hamiltonian depends only 
on MPS matrices associated with nearby particles. There 
are, however, various technical issues we need to handle. 
In particular, we cannot use perfectly canonical MPSs 
but only an approximated version of those, which im- 
poses further technicalities, and in particular, the neigh- 
boring MPS matrices do not match perfectly (we call this 
imperfect stitching). These technicalities make the error 
analysis a bit subtle. Before we formally define canon- 
ical MPSs and provide the details of the algorithm, we 
mention an implication for a related problem. 



B. Commuting Hamiltonians in ID 

A problem related to finding minimum-energy MPS 
states is the complexity of calculating the ground en- 
ergy of commuting Hamiltonians in which all the local 
terms commute. Bravyi and Vyalyi proved that for 2- 
local Hamiltonians the problem lies inside NP Q. For 
k- local commuting Hamiltonians with k > 2, the com- 
plexity of the problem is still open. The complexity of the 
ID case was not studied before as far as we know; an im- 
mediate corollary of Theorem[T]is that there is an efficient 
classical algorithm for approximating the ground energy 
of commuting Hamiltonians in ID to within l/poly(n). 
This is because the ground state of a commuting Hamilto- 
nian in ID is an MPS of constant D (this is a well-known 
fact that we explain later for completeness), and there- 
fore Theorem [T] can be applied. In fact, the result can 
be improved to an exact algorithm (up to exponentially 
good approximations due to truncations of real numbers) 



for a certain general class of problems. We prove the fol- 
lowing. 

Theorem 2 Given is a ID Hamiltonian whose terms 
commute. There is an efficient algorithm that can com- 
pute the ground energy of this Hamiltonian to within any 
desired accuracy e in time polynomial in n and in-. If we 
may assume also that the ground space of the total Hamil- 
tonian is well separated from the higher excited states, by 
a spectral gap which is at least l/poly{n), then the algo- 
rithm can find both the ground energy and a description 
of an MPS for the ground state exactly (i.e., up to expo- 
nentially small errors due to handling of real numbers). 

The basic idea for the exact algorithm can be illus- 
trated when the terms in the Hamiltonian are all pro- 
jections and the ground state is unique. Since the terms 
commute, the ground state is an eigenstate of each term 
separately, with eigenvalue either or 1. We start by 
applying the dynamic programming algorithm, to create 
a good approximation of the ground state. From this 
approximation we can deduce the correct eigenvalue (0 
or 1) for each of the terms. The projections on the rele- 
vant eigenspaces can then be applied to the MPS of the 
approximate state to make it exact. One gets a tensor 
network of small depth, which can be converted into an 
MPS again. It can be shown that applying the projec- 
tions does not increase the bond dimension of the MPS 
too much with respect to the approximating state. The 
details are fleshed out in the proof (Sec. W\ . 

Handling the degenerate case is very easy; essentially, 
we force the dynamic algorithm to choose one state of the 
various possible states. The assumption on the spectral 
gap ensures that the errors created by the epsilon net 
approximations would not cause a confusion between the 
ground space and some excited states. 

We provide an alternative proof of Theorem [2] which 
also uses dynamic programming. In fact, this proof holds 
for a somewhat stronger version of the theorem, in which 
the conditions on the spectrum are far less restrictive. 
In the algorithm given by this approach, the state is not 
provided as an MPS but rather as a tensor product of 
two-particle states. The construction is based on the 
work of Ref. in which it is proved that the ground 
states of 2-local commuting Hamiltonians have this spe- 
cial structure. Bravyi and Vyalyi use this structure to 
show that general 2-local commuting Hamiltonians prob- 
lem is in NP. Since ID chains with fc-local interactions 
can always be made 2-local by treating nearby particles 
as one particle of a larger dimension, Ref. [8[ implies 
that the ID commuting problem lies in NP. However, 
by exploiting the special form of these ground states, dy- 
namic programming can be applied to find the solution 
efficiently in a very similar manner to the ID CSP, in 
which the NP witness is found using the ID structure. 
Unfortunately, in this approach too, it seems that one 
cannot avoid some assumption on the spectrum of the to- 
tal Hamiltonian, albeit a significantly less restrictive one. 
Throughout its execution, the dynamic programming al- 
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gorithm compares various partial energies. If these are 
too close, and cannot be distinguished even by computa- 
tions performed with exponentially good precision, then 
the algorithm might get confused between the ground en- 
ergy and a slightly excited state. A sketch of the alterna- 
tive proof of Theorem [2j providing the stronger version 
of it, and a discussion of the above precision issue are 
given in Sec. [Vj 

We mention that this latter proof (and in particular the 
observation that dynamic programming can be useful for 
ID quantum systems and not only for ID classical sys- 
tems) was the inspiration for the current article, rather 
than its corollary. 

C. Discussion and Open Questions 

It is natural to ask how much the results in this ar- 
ticle can be improved. By Ref. [13j . we know that no 
polynomial algorithm exists for finding optimal approx- 
imations of polynomial bond dimension (unless P=NP). 
However, the difficult instances of Ref. |13| have a spec- 
tral gap of l/poly(n). Hastings has shown that ground 
state of ID quantum system with a constant gap can be 
approximated by a MPS with polynomial bond dimen- 
sion [l5j . However, this is too large to immediately yield 
an efficient algorithm from our result. It may still be 
true, however, that under the additional restriction that 
the Hamiltonian has a constant gap, a polynomial time 
algorithm exists, even when the bond dimension is as 
large as polynomial. 

It is very likely that the efficiency of our algorithm 
can be significantly improved even for the general case. 
In particular, a factor of n would be shaved from the 
error in Theorem [1] if we could use an e-net which is both 
exactly canonical and enables perfect overlap between 
matrices at neighboring particles, as we later explain. 
Unfortunately, even if this can be done, the running time 
for this general algorithm is still quite large. 

As mentioned earlier, we leave for further research the 
question of how this algorithm can be used in combi- 
nation with DMRG, and how certain symmetries in the 
problem can be utilized to enhance its performance time 
for specific interesting cases. 

We note that very similar results to those presented 
in this article were derived independently by Schuch and 
Cirac 

D. Paper Organization 



Section |H] starts by defining tensor networks, MPSs 
and canonical MPSs. In Sec. |Hl] we describe the algo- 
rithm. This is where the e-nets are defined and an al- 
gorithm to generate them is given. Also in Sec. IIII1 we 
show how they are used in the dynamic programming al- 
gorithm. Sec. II VI provides an exact analysis of the error 



accumulated in the algorithm. The complexity is ana- 
lyzed as a function of the desired error. In Sec. [V] we 
provide the proof regarding the approximate and exact 
solutions for the commuting ID case. We defer several 
technical lemmas to the Appendix. 



II. TENSOR NETWORKS AND MATRIX 
PRODUCT STATES 

A. Tensor Networks 

We start with some background on tensor networks, 
since MPSs are a special case of those. A detailed intro- 
duction to the use of tensor networks in the context of 
quantum computation can be found in Refs [l7l - [l9| . 

A tensor network is a graph in which we allow some of 
the edges to be incident to only one node. These edges 
are called the legs of the network. Each node is assigned 
a tensor whose rank (number of indices) is equal to the 
degree of the node. Each index of the tensor corresponds 
to one edge that is incident to that node. To each edge (or 
index) we also assign a positive integer which indicates 
the range of the index. The indices associated with some 
of the edges in the tensor network may be assigned fixed 
values. The other edges are called free edges. 

We call an assignment of values to the indices of the 
free edges in the network a configuration. With all the in- 
dices fixed, the tensor at each node in the network yields 
a particular value. We say that the value of the configu- 
ration is the product of the values for each of the nodes. 

The value of the network is in general a tensor, whose 
rank is equal to the number of legs in the network. If 
there are no such legs, the value is simply a number (a 
scalar). Each assignment of values to the indices asso- 
ciated with the legs of the network gives rise to a value 
for the network tensor. We compute the tensor value 
for this assignment by summing over all configurations 
which arc consistent with that assignment the value of 
each such configuration. 

We note that often in the literature, one assigns values 
not to entire edges but to the two sides of an edge sepa- 
rately (where each side inherits its range of indices from 
the tensor associated with the node on that side). In the 
evaluation of the network, we require that the values on 
the two sides of one edge are equal, or else the entire 
configuration contributes zero to the sum. 

Tensors will be denoted as bold-face fonts: A, T, fi. 
Their contraction will be denoted as an expression like 
AT/x, when it is clear from the context along which in- 
dices the contraction is performed. 

It is possible to restrict a tensor of rank A: to a tensor of 
rank fc — 1 by assigning a fixed value to one of its legs. For 
example, T Q is the restriction of the tensor T to the case 
in which the relevant edge associated with the index a is 
given some value (which, by the usual abuse of notation 
of variables and their values, will also be denoted as a). 

It is convenient to associate with every tensor (which 
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can be given as a contraction of a tensor network) a quan 
turn state. For example, let F 

dcf 



a be a rank-3 tensor. 



Then we define |T) 



B. Matrix Product States 

We work in the notation of Vidal [2(| for MPSs, with 
minor changes. A MPS of a chain ofnd dimensional par- 
ticles, with bond dimension D, is a tensor network with 
a ID structure as in Fig. [TJ Horizontal edges correspond 
to indices ranging from 1 to the bond dimension D and 
are denoted with a, /?,..., while vertical edges correspond 
to indices ranging from 1 to the physical dimension d. 
(In our description, the end particles will actually have 
a different physical dimension, denoted d en d- This is re- 
quired due to a technical reason described in Sec. Ill C|) 
The indices of vertical edges are denoted with . . The 
figures show two types of nodes: black and white. The 
tensors of black nodes are typically of rank 3 (except for 
the boundary tensors, which arc of rank 2), and we de- 
note them with T's. For example, when the tensor that is 
second from left is written with its indices, it is denoted 

as Ta2,a 3 , where the index [2] in brackets corresponds to 
its location in the graph. The tensors associated with 
white nodes are always of rank 2 and are denoted with 
A's. They are required to be diagonal and hence are given 
only one index (i.e., X a i)- Without loss of generality, we 
will also demand that the entries of A are non negative 
since the phases can be absorbed in the neighboring T 
tensors. 

The MPS defined by this network is \ip) = 
Ei I ,...,i n cr ii-i»l i i> •••!*") with 



C, 



del' 



OL2 ,■ ■ ■,&■] 



p[l]'i A [2]p[2p x[3] ... A [t.] r [n]'« 



In the language of tensor states, is exactly the ten- 
sor state of the contraction T^X^T™ ■ ■ • A^rW. 



C. Canonical MPSs 

An MPS is in canonical form if every cut in the chain 
induces a Schmidt decomposition (as in Fig. [2]). In other 
words, we can rewrite the MPS by changing the order 
of summation to sum last over the index j3 of the jth 



A tensor: 



R 



denote the contraction of the all the tensors to the left 



where (r' j1 ) 



IR^' 1 )) are their 



(right) of the cut with fixed j5 and |L 
corresponding states. Then the canonical conditions are 

that for all j from 2 to n, Y,p {^ff = 1 and (La \^f) = 

(Rq'|R^') = 5 a p. In addition, for normalization, we 
require that the entire MPS state is normalized, which is 
guaranteed by the normalization requirement on the A^ 
tensors. 



There is a small technical issue that needs atten- 
tion: The canonical conditions cannot be satisfied at 
the boundaries if d < D. Consider for example the left 
boundary; there are not enough dimensions in the Hilbert 

space of the left particle for an orthonormal set of vec- 

[21 

tors |Lq ) to exist. This issue remains a problem even as 
we move away from the boundary by one particle, as the 
dimension of the left-side Hilbert space increases to d 2 
which may still be smaller than D. There are many ways 
of handling this technicality; here we choose to assume 
that the particles at the end of the chain have dimension 
of at least D. This will ensure that at any cut along the 
chain, the Hilbert space of the subsystems on each side 
have dimension of at least D. We can achieve this by 
grouping s particles at each end of the chain into a single 
particle, where s is chosen to be the smallest integer such 
that d s > D. Denote d s as d en d, the dimensionality of 
each of those end particles. Note that d en d = d s < Dd. 
The dimension of the rest of the particles will remain d. 
We renumber the particles after the grouping, so that the 
new Hi 2 is now the sum of the old ff^i+i for i ranging 
from 1 to s. The term in the Hamiltonian for the last two 
particles is adjusted in a similar manner. We will assume 
from now on that the Hamiltonian is given in this form. 

Let us now see how the canonical conditions can be 
stated in a local manner. Graphically, the second condi- 
tion is equivalent to 



r [i] 



— e— 



— e — 



[ 



(2) 



and similarly from the other side. Here the upper part 
of the network corresponds to |La'), and the lower part 
corresponds to (L»' | . Notice that the canonical condi- 
tions imply that we can "collapse" the network both from 
the left side and from the right side. Moreover, as this 
condition holds at every bond, it is not difficult to see 
that a necessary and sufficient condition for an MPS to 
be canonical consists of the following local conditions on 
(\W,TW, A^ 1 ]): For every j = 2, . . . , n - 1, 

{{X^T^) a \{X^T^) )=5 o 
{{T^X^) a \{T^X^) p )=5 



(left canonical), (3) 

(right canonical) . 

(4) 



For j = 1 and j = n, for 1 < a, (3 < D: 

(rW|rW) = (rW|rP) = ^ 

(boundary canonical conditions) 



(5) 



We also require that the A's are normalized, namely, 
that for every j from 2 to n, 

(A0|A0> = 1 . (6) 
Graphically, these conditions are summarized in Fig. [3] 
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FIG. 1. MPS as a tensor network. 
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FIG. 2. A description of a canonical MPS. The tensors are chosen such that cutting a MPS between the j — 1th and j'th 



particles corresponds to the Schmidt decomposition between the left and right parts: 



E/3 -\s 
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Any triplet (\W,TW,\V+ 1 1) = (A,I\/z) that satisfies 
the normalization and the left and right canonical condi- 
tions [Eqs. d3j) , ((4j) , and ((6])] is called a canonical triplet. 
Such a triplet can be associated with a quantum state on 
three particles |V>) = |AI» = J2 a ,t,p ^ap^nW) \i) \(3), 
with the following properties: ||^|| = 1; the Schmidt basis 
of the first particle is the standard basis, with Schmidt co- 
efficients {A Q }; and the Schmidt basis of the third particle 
is the standard basis, with Schmidt coefficients {fip}. A 
canonical MPS can thus be described as a set of canonical 
triplets (or equivalently 3-particle states) such that the 
right /i. tensor of one state is equal to the left A tensor of 
the next canonical triplet. 

Instead of describing a canonical MPS in terms of 
canonical triplets (A, T, fi), we will often describe it using 
canonical pairs (A, B), where 



B d ^ I> 



The advantage is that for canonical MPSs, the elements 
in B are always bounded (since the L2 norm of B sat- 
isfies ||.B|| = \fD; see Sec. HID]) , unlike T whose entries 
can approach infinity when the corresponding fx entries 
approach zero. 

An MPS that is described by the contraction 
rWAMrPlA^-.-ANrl"] can also be denoted as 
TNxWbWbW ...B^-ilrH No information is lost 
since /it can always be recovered from (\,B): an is the 
norm (see Sec. Ill D) of the tensor state (A£?)a: j21| 



1/2 



a/3 I 



The advantage of working with the canonical form is 
that the energy of local Hamiltonians involves only the 
local tensors, as the following figure illustrates: 



-e-...-j_e— f 



A " 



T t 



i- 



The above equality was obtained using the canoni- 
cal conditions that are described in Eq. ([2]). Conse- 
quently, the energy (tp\Hj—i j \\ip) only involves five ten- 
sors: Atf-^rb'- 1 ], Ab],rM, and A^'+i]. Similarly, H h2 
only depends on I'M, X^ 2 \ T^, A^, and #„_!,„ only de- 
pends on A^-^.rl"- 1 ], AW.rW. It is important that 
each energy term does not involve tensors further to the 
right in the chain since the algorithm attempts to com- 
pute (or approximate) the optimal MPS up to a certain 
point. We would like to be able to grow the description of 
the state from left to right, without affecting the energies 
we have already computed. If matrices in the right side 
of the chain affected energies of terms in the left side, we 
would need to go back and change the MPS matrices of 
the particles we have already handled after we make new 
assignments to particles on the right. This would ruin 
the entire idea of dynamic programming. 

Fortunately, any MPS representing a normalized state 
can be written as a canonical MPS with no increase in 
bond dimension. This follows from Ref. (20| . in which it 
is shown that any state with Schmidt rank of at most D 
across any cut can be written as a canonical MPS with 
bond dimension D. 



D. Tensor Norms and Distances 



We define fi 
pairs. 



clef 



/x(A, B) this way also for non-canonical 



We 
V- 



use 

IX;, 



the L2 norm on tensors ||A^|| 2 = 
,.i k \ 2 - This norm of course induces a metric, 
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r [j] 



AW 

A*U1 



= 1 



1 



(a) 




r m 



(b) 



A*b'+i] 



pi 1 ] 



cr;-c; c;-c 



]-;: 



(c) 



FIG. 3. (a) The normalization condition for j = 2, . . . ,n. (b) The left/right canonical conditions for j 
Eqs. ((3]) and Q]. (c) The boundary canonical conditions for j = 1 and j — n [see Eq. JjjJ], 



1 [see 



namely, a way of denning the distance between tensors of 
the same rank. It is easy to see that the norm of a ten- 
sor C is equal to the Eucledian norm of its corresponding 
state \C). Also, for a rank-2 tensor (which can be viewed 
as a matrix), it is known that its operator norm is not 
larger than its tensor norm (which in this case is simply 
the Frobenious norm). 

It is true that for any three tensors, B 1} B 2 , B, we 
have \\B X B - B 2 B\\ < \\Bi - B 2 \\ ■ \\B\\. In fact, many 
times in the context of MPSs, a much stronger inequal- 
ity holds. Assume B connects with £?i or B 2 along one 
edge, indexed by a. Assume further that ||-B Q | = 1 for 
every a (in the context of canonical MPSs, it will often 
be the case that we consider the contraction of one side 
of the chain with a fixed index a of the cut edge, and this 
contraction is indeed of norm 1 by the canonical condi- 
tions). In this case, we have a much stronger inequality, 
which can be easily verified: 



And similarly, 

\\AXB-AXB\ 



XB — XB\\ 

Y,\K\ 2 \\(B a -B a ) 

a 

< max ||_B a — B a \\ . 

a 

III. THE ALGORITHM 



"I 1/2 



(10) 



\B X B B 2 B\\ = ||Bi -B 2 \ 



(7) 



As discussed earlier, in order to carry out the outline 
described in Sec. II Al we would like to work with canoni- 
cal MPSs. Additionally, since the tensor pairs (A, B) for 
neighboring nodes overlap, we would like an e net over 
canonical pairs such that /x(A, B) could be equal to the 
A of the next pair (we call this perfect stitching). We do 
not know how to efficiently construct an e net that satis- 
fies those conditions exactly; we resort to approximately 
canonical MPSs with approximate stitching. 



We can apply this to cases of interest, when we com- 
pare contractions of tensors which differ in only a single 
term. For example, consider vector A Q with norm 1 and 
two tensors B,, ^ jj and A a j u „ j t such that when a is 
fixed, the resulting tensors A a and B a have norm 1. Let 
A, A and B be tensors with the same rank and dimen- 
sions as A, A and B. We have, by Eq. {7J, 

\\AXB-AXB\\ = \\X-X\\, (8) 

and also 

\\AXB - AXB\\ = \\AX- AX\\ . (9) 



A. e nets 

We fix e > (to be determined later) and define two e 
nets. We start with discrctizing rW and r^"'. 

Definition 1 (the G en d e net) G en d is a set of canon- 
ical boundary tensors [see Eq. such that, for any 
canonical boundary tensor T there is T € G en d such that 
for each a, \\T a — T a \\ < e. 

We now define an e net over the intermediate tensors, or 
more precisely, for the pairs (A, B). 

Definition 2 (the G e net) G is a set of pairs of ten- 
sors (A, B) such that: 

1. X is positive and normalized: For all a X a > 
and (A|A) = 1. 
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2. G is ane net: For every canonical triplet (A, f , p.) 
there is (A, B) G G such \\\t(i-\B\\ < e. 

3. B is perfectly right canonical: For every a, a' , 
(B a \B a /) = 6 aa > (here a, a' are the left Greek in- 
dices of B). 

4- (A, B) are approximately left canonical: For 

every f3 =/= /?' , 

\{(XB)p\(XB)p,)\ < 3e . (11) 

B. e net Generators 

Wc now explain how to construct such nets efficiently. 
Both generators for the e nets will make use of the fol- 
lowing general lemma 

Lemma 3 For any positive integers a < b and any v 
in the range (0,l/y/a\, we can generate a set of a x b 
matrices S a b over the complex numbers such that for any 
A G S a b, the rows of A are an ortho-normal set of length 
b vectors. Furthermore, for any a x b matrix B whose 
rows form a set of orthonormal vectors, there is a matrix 
A € S a b such that each row of A—B has L2 norm at most 
v. The size of S a b is at most 0{{72b/v) 2ah ). The time 
to generate S a b is 0(a 2 b{72b/v) 2ab ). If a = 1, we can 
generate a set of vectors with real non-negative entries, 
rather than complex. The size of the net is 0((72b/v) b ) 
and the time to generate it is 0{b{72b/ v) h ) . 

The proof appears in the Appendix. 



1. Generating G erL d'. 

Invoke Lemma [3] with v = e, a = D, and b — d en d- For 
every A e SD,d cnd , add a T to the e net, where A a ^ — 

. Note that the conditions of Lemma 02 are satisfied 
if e < 1/ \f~D. Since d en d < Dd, the size of the net is at 
most (72Dd/e) 2dD and the time to generate it is 0(dD 3 ) 
times the size of the set. 

2. Generating G: 



to invoke Lemma [3j we require that e < 2j\f~D. For any 
matrix A a up\ in the set, we generate a tensor B where 
B l a a = A a upy This way we generate a set of pairs 
(A, B) which satisfies both the normalization condition 
[condition (1) of Definition [2] and the condition of being 
perfect right canonical [condition (3) of Definition [5] . 

To see that we in fact have an e net [i.e. condition (2) 
is satisfied], consider a perfectly canonical pair (A, B), 
and let us find a pair (A, B) in the net that is e-close to 
it. We first replace A with a A from the first net and then 
replace B with a B from the second net. Using Eq. ([8]), 
we have that 

||AB-AB|| = ||A-A|| < € - , 
Using Eq. (fTOj) . we also have 

|| AB - AJ3|| < max||B Q - B a \\ < - . 

a 2 

Next, wc discard all tensors (A, B) that are not approx- 
imately left canonical, namely, those that violate condi- 
tion (4). It remains to show that the remaining tensors 
still satisfy condition (2), that is, the e net condition. We 
do that by showing that a pair (A, B) that is e close to a 
canonical triplet must necessarily be approximately left 
canonical. Therefore, such a pair would not have been 
eliminated. 

To see this, let the tensor A = XT/j, be the contraction 
of the canonical triplet and C be the contraction of \B 
from the net such that \\A — C|| < e. The fact that A is 
perfectly left canonical is expressed in the fact that for 
every j3 ^ ft', (Ap\Ap') = 0. To prove that C is approx- 
imately left canonical, we need to show KC^IC^')! < 3e. 
Indeed, \\A — C\\ < e implies \\Ap — Cp\\ < e for every 
/?. Assume (3^ /3'. Then 

\{Cp\C p ,)\ = \(Ap + {C P - Ap)\Ap, + (Cp, - Ap,))\ 

< \(Ap\Ap,)\ + \(Ap\Cp,-Ap,)\ 

+ \{Cp- Ap\A ,)\ + \{Cp- AplCp, - Ap,)\ 

< ||^||||C^-^|| + 11^1111^-^11 
+ \\Cp-Ap\\\\Cp-Ap\\ 

< 2e + e 2 < 3e . 

This concludes the proof that G is indeed an e net ac- 
cording to Definition [2j 



We generate G by first generating an e/2-net over the 
A's and an e/2-net over the S's. To generate the net of 
the A's, invoke Lemma [3] with a = 1, b = D and the v in 
the lemma set to e/2. Note that we would like to have a 
A with non negative real entries. According to Lemma [3l 
this actually requires fewer items in our net since we are 
omitting the phases in each entry in the tensor. The 
resulting net for the A's has size (lA4D/e) D and can be 
generated in time 0(D(lAiD/e) D ). 

To generate the net over the B's, we invoke Lemma [3] 
with a = D, b = dD, and v ~ e/2. Note that in order 



3. Complexity of Generating G and G m d: 



By Lemma H N = \G\, the size of the e net G is 



N = 



^144dD^ 



D+2dD J 



(12) 



This is the size of the set formed by taking all pairs 
(A, B), where each A and B come from their respec- 
tive nets. The time required to generate the original net 
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(before tensors are discarded) is 0(dD 3 N). The cost 
of checking whether a (A, B) pair is approximately left 
canonical is 0(dD 3 ), so the total cost of generating the 
net is 0(dD 3 N). 

For G en d, both the number of points and the running 
time which were determined in Sec. 1111 B 11 are bounded 
above by the corresponding bounds of G. 

C. The algorithm 

When processing particle j, the algorithm creates a list 
Lj of partial solutions, one for each (A, B) pair in G. For 
each such partial solution, a tail (i.e., the tensors to the 
left of the jth particle) and energy is kept. 

First step:: Create the hrst list L^. For each 
(\W,BM) G G, find its tail, namely the G 
G en d which minimizes the energy with respect to 
Hi 2 of the tensor rM A^-B^ . Denote this minimal 
energy by E 2 (\^- 2 \ B^). We keep both the tail and 
the computed energy, for each pair (A^, B^) G G. 

Going from j = 3 to j = n — 1:: we assume we 
have created the list For each pair 

(Ab'-^Bk'- 1 ]) G G there is a tail in Lj- X : 
r [i] , (A [2] > B m )5 (A [3] |B P!) (A b-2] ( B y_a] } 

and an energy value that we denote by 
£; i _i(A b '" 1] ,-B b '- 1] ). To create Lj, we find a tail 
for each (A^,-B^) G G. We require that the tail 
for a given (A^, B^) is an item in which sat- 
isfies the "stitching" condition: 

||Ai(A [j ' _11 ,J3 [j '- 11 ) - X b] \\ < 2e . (13) 

We pick the tail for (A^, J3^) to be an 
item in Lj-\ which satisfies the stitching con- 
dition and minimizes Hj-i tj (X^-^B^>-^B^) + 
£^_x(A^ _1 l, -B^ _1 1). The minimum such value is 
defined to be Ej(X^,B^). 

Final step:: The final step, j = n, is exactly as in 
the intermediate steps except the algorithm goes 
over r["l G G en d, rather than over pairs from G 
and there is no stitching constraint. More pre- 
cisely, we pick the tail for to be the item in 
L n _i which minimizes H n ^ hn (X^ n ^B^-^T^) + 
E n -t (At™" 1 ] , B I™" 1 ] ) . The minimal value is defined 
to be £ n (rM). 

Finally, we choose r^l which minimizes E n (T^). 
We output the MPS that is defined by and its 
tail: 

|fi) d ^ f ■ ■ ■ B^lrM) , (14) 

together with the energy which the algorithm cal- 
culated: 



Note that since each (X^\B^) is perfectly right 
canonical, the state |f2) is normalized. This can be seen 
by contracting the tensor network corresponding to the 
inner product (f2|0) from right to left. 

Unlike in the classical case, our algorithm does not 
search all states due to the discretization. More- 
over, it does not optimize over the real energy of the 
states that it does check, but rather over E a i g (£i) = 
'Zj Hj- hj (X^-^B^-^B^). E aUj is different from the 
true energy E because the states are not exactly canon- 
ical. Note that the output E a i g (ft) is thus just an ap- 
proximation of the real energy E(Q) of the output MPS 
|0). We output E a i g (Q) anyway, since our guarantee on 
its error is somewhat better than on the error for E(Cl), 
as we will see in Sec. IIVI 

The following claim easily follows from the same rea- 
soning as for the classical dynamic programming algo- 
rithm: 

Claim 4 The algorithm finds the state which 
minimizes E a i g among all MPSs of the form 
rWAPlBPlBM-.-B^rW, such that rM,rw € 
Gend, (\W,BW) G G for all j G {2,...,n- 1}, and the 
stitching conditions (Eq. U3\) ) are all satisfied. 



IV. ERROR AND COMPLEXITY ANALYSIS 

In order to finish the proof of Theorem [T] we will prove 
the theorem below. As noted above, this theorem actu- 
ally gives a better error bound on E a i g (Q,) than the bound 
on E(SY) that is given in Theorem [1] 

Theorem 5 (Error bound) Let Eq be the minimal en- 
ergy that can be achieved by a state with bond dimension 
D, and J the maximal operator norm \\Hj j+i\\ over all 
terms. Then: 

E alg (n) - 6Jne < E a < E{Q) < E alg (fl) + § JD 2 n 2 e . 

(16) 

It is easy to verify that as long as nD 2 > 12, Eq. (fT6|) 

implies Eq. (p} of Theorem [T] 

Proof: 

By definition, Eq < E(Q). We first prove that 
E alg (n) - 6Jne < E . Let: 

|V) = ifWAMfPl-.-ANrM) • 

be a state with E(%p) = Eq of bond dimension D, written 
as a canonical MPS. For every triplet (A^, f ^, A^ +1 1) 
for j = 2, . . . n — 1, we associate a pair (A^, B^) G G 
which is e-close to that triplet. In addition, we find Ti G 
G en d close to Ti and r„ G G e „d close to r„. We define 
the state: 



E alg (n) d ^ 1 E n (TW) 



(15) 



\4>) = |fiAaBaB 3 ---B„_if n ) 
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Just like |n), this state is normalized due to the fact 
that the tensors in G en d and G are perfectly right canon- 
ical. 

We first claim that E a i g (£l) < E alg (4>). This follows 
from the fact that \(j>) belongs to the set of states over 
which the dynamic algorithm searches (see Claim @|, 
since the X^'^B^' 1 ^ and X^B^ satisfy the stitching 
condition (|13|) . as promised by the following lemma: 



Lemma 6 For every j = 3, . . . , n — 1, 

IMAb'- 1 ], jjb"- 1 ]) - A b ' ] || < 2e . (17) 

Proof: We use the fact (established in Lemma |5] in 
the Appendix) that for any two bipartite states \A) = 
J2i a i\i)\ A i)i witn normalized \Ai), \B) = X^K)!- 8 *) 
with normalized \Bi) 1 we have ^ \a,i — bi\ 2 < \\A — B\\ 2 . 

The tensors A^'lf M A^ and X^B^ represent two 
quantum states on 3 particles, where in both states, the 
Schmidt basis of the first particle is the standard basis, 
and the perfect right canonical condition of Definition 
[5] (or alternatively, the condition of Equation @| holds. 
The Schmidt coefficients are given by {Aq'} and {Aq'}, 
respectively. According to the above fact (Lemma [5]) 

- A^|| < ||A^f [j]a^ +1 J - X^B^\\ < e . (18) 

Similarly, we know that 

HAb'-yf b'-^A^ - XV-^Bti-^W < e . Consider 
now these 3-particle states expanded in terms of the 
basis vectors \j3) of the third particle. Denote these 
expansions by J2p a p\ v p)\P}> with normalized \vp), and 
^2abp\wp)\j3) with normalized \wp), respectively. Then 

by definition, a = Xf, and bp = /^(A^'" 1 ], B^). 
We can therefore apply again Lemma [5] and get: 
H^Atf- 1 ),!^- 1 ]) - A^|| < e. Together with Eq. (O, 
we therefore obtain ^(At- 7 '" 1 ], B^) - A^ j| < 2e. ■ 

Thus far, we have established that E a i g (fl) < E a i g ((f>). 
We will therefore prove the inequality E a i g (fl) — 6Jne < 
Eq by showing that \E a i g ((f>) — Eo\ < 6nJe. Observe that 
each energy term in E depends solely on two overlapping 
triplets A^'lf b']A[ J+1 ]f [• 3+1 lAb +2 l. The corresponding en- 
ergy term in E a i g {4>) depends only on X^B^B^ +1 ^ . We 
now bound the distance between these two tensors. Wc 
have 

X [j] B [j] B [3+1] - x [3] t [j] x [: > +1] t [j+1] x [3+2] 
= -x^r^x^)B^ 

+ A [j]fM(Ab' +1 ] -A^ +1 l)Bb' +1 ] 

+ A [j ' ] f W(x [j+1] B [j+1] - X [j+1] f [3+1] X [j+2] ) 



Taking the LHS and RHS sides of the above equation, 
and using Eq. (JSJ) and Eq. ©, we have that 

\\X [3] B [j] B [:i+1] - x [:i] r [j] X [3+1] t [j+1] x [j+2] \\ 

< \\xV ] bW _ x^t^x^+^w 

+ || A b'+i]_Ab-+i]|| 

+ ||A [j ' +11 JB [j ' +l1 - A [i+1] f y+^X [j+2] \\ . 

The first and third term in the above sum can be bounded 
by e because of the condition of the e net G. The norm 
of the middle term is bounded in Eq. (JTHJ) . Therefore 
the norm of the difference between the tensors is at most 
3e. It follows that the difference between the two energy 
contributions is at most 6e||i?j 7 j+i || < 6eJ. 

We illustrate the boundary cases by working through 
the analysis for the left end of the chain. We want 
to bound HfWAMfMAl 3 ] - t^X^B^l Note that 
||f W(A[ 2 ]f WX^ - XWbM)\\ is bounded by e because of 
the conditions on the e net and Eq. 0. Using Eq. (|10l) . 
we have that 

IKfW-f^AWlH <max||fW-fW|| <e. 

a 

Hence, the overall bound on the difference is 2e. It follows 
that the difference between the two energy contributions 
is it most 4e||_ffi.2|| < 4e J. A similar argument holds for 

Hn—l,n • 

Now we turn to the right inequality in Theorem [5] 
and show \E(fl) - E a i g {n)\ < \,JD 2 n 2 e. We bound 
the difference in energy for each term Hj-ij. The 
contribution of this term to E a i g (fl) is calculated from 
X^-^B^-^B^. The true energy, however, depends on 
T^X^BMbW ■ ■ ■ B^ since |fi) is only approximately 
left canonical. We will show that the error accumu- 
lates linearly as we sweep from left to right, summing 
up to 3jJD 2 e for Hj_ij. Therefore, the total error is 
\E alg {Q) - E(Q)\ < yb 2 n 2 e. 

We now provide a more accurate argument. The en- 
ergy estimate for the term Hj^i j is calculated from the 
contraction XU-^B^B^. Graphically, this contribu- 
tion is given by 



A^'-H 



Ho -i 



J-i-,J 



4 



The true energy, however, is calculated from the contrac- 
tion of X^B^B^ ■ ■ ■ Bfr'l, Graphically, this is given by 
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(Notice that we have collapsed the rW terms because of 
the canonical condition [S] - see Fig. 0(b)). 

Had the state |f2) been perfectly left canonical, the two 
would have been the same. But since it is only approx- 
imately canonical from the left, there is some difference 
that can be bounded. The analysis is done iteratively 
from left to right. We start by writing 

BP! 






In this picture, the tensor Rpp' is defined to be off di- 
agonal (i.e., equal to zero on the diagonal: Rpp = 0) 
and for the (3^/3' terms, it is defined by Rpp< = 

((aW^kaWV) = E^I^S^?)*- a 

is defined by: 

A^ d ^R^+S^(\xf\ 2 -\^\ 2 ). 

Using the fact that (\^ 2 \B^) is approximately left 
canonical (see Eq. (fTTj) ). and the stitching conditions of 
Al 3 l and fi^ (see Eq. (fT5|) ), it is easy to see that for every 



A P p> \ <3e 



(19) 



We may therefore write the true energy contribution as 
the sum of 



Bb- 2 ] 




and 



Bb-21 




The analysis of the first term is done in the next iteration 
step. The second term can be seen as the error introduced 
by the fact that (X^B^) is approximately left canoni- 
cal. To estimate its size, notice that it can be viewed as 
the expectation value of the operator A <g) Hj_i j (here 
A is viewed as a matrix), with respect to the MPS that 
is described by \B^B^ ■■■B^). Using Eq. {F® and 
the assumption ||-Hj-i j\\ < J, it is easy to see that 
< 3JDe. Here, in both cases, we used 



H 



|| • || to denote the operator norm of A (8 Hj_ij, instead 
of the usual tensor norm; we can do this since the opera- 
tor norm is at most as large as the tensor norm, and the 
tensor norm of A is at most 3De. Moreover, the norm 
of the MPS \BWBW ■ ■ ■ BW) is exactly VD (it would 
have been exactly 1 had there been a A' 3 ! term before 



£?[ 3 1), and therefore the amplitude of second term is up- 
per bounded by 3JD 2 e. 

Carrying the same analysis all way to (X^~ 2 \ B^~ 2 ^), 
we end up with a term that is identical to the energy 
estimation of the algorithm, plus some error term whose 
amplitude is at most 3jJD 2 e. Therefore, by simple alge- 
bra, we have that for the total system, 

\E alg -E(Q)\ < pD 2 n 2 e . (20) 



For a target error S, we select e < jJW 5 ■ Using the 



2JB 2 n 2 

bound from Eq. (jl2|) , we get that the size of the net for 
the interior particles is 



N = O 



'/lUJdD^n 2 \ D+2dD 



(21) 



Note that in using Lemma [21 we required that e < 
1 / \J~D. It is reasonable to expect that S/Jn < 1 ( mean- 
ing that the desired error is at most the maximum energy 
in the system) which implies that this condition is met. 
The algorithm has n iterations in which 0(N 2 ) possible 
extensions for the MPS are considered. For each such 
possibility, we perform a contraction of tensors (A, B, B') 
in order to evaluate the energy of a particular term. This 
contraction takes time 0(D 3 d 2 ). Thus the total running 
time is 0(nN 2 D 3 d 2 ). 



V. COMMUTING HAMILTONIAN IN ID 

We now prove Theorem O Let us first notice that The- 
orem [T] immediately implies the first claim in Theorem [21 
namely that approximating the ground state and ground 
energy of a commuting Hamiltonian in ID to within poly- 
nomially good accuracy can be done efficiently This fol- 
lows from the well known fact that the ground state of 
a commuting Hamiltonian in ID can be described by an 
MPS of constant bond dimension. We can therefore ap- 
ply Theorem Q] to the problem, and hence approximate 
both the ground state and ground energy efficiently 

For completeness, here is a sketch of a proof of this 
fact: assume we have a 2-local commuting Hamiltonian 
in ID. If the Hamiltonian is fc-local for k > 2, just com- 
bine adjacent particles together. To see that there is a 
ground state which is described by an MPS of constant 
bond dimension, notice that for any commuting Hamilto- 
nian, there is a ground state \ip) which is an eigenvector 
of each of the terms in the Hamiltonian, with some well 
defined eigenvalue for each term. For each term, consider 
the projection onto the eigenspace corresponding to that 
eigenvalue. For any state with non-zero projection on the 
ground state, applying these projections (no matter the 
order) would result in a ground state. Since there is al- 
ways a computational basis state \w) that has a non-zero 
projection on the ground state, we can express a ground 
state as the projection of all these local terms applied 
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to \w). We first apply the projections which interact the 
pairs of particles (1, 2), (3, 4), etc; we then apply the pro- 
jections that interact the pairs of particles (2,3), (4,5), 
etc. This sequence of operations can be viewed as a ten- 
sor network of depth 2. We can thus represent the ground 
state as the contraction of a tensor network of depth 2. It 
can be easily seen that such a state must have a constant 
Schmidt rank along any cut between the left and right 
sides; to move to an MPS of a constant bond dimension, 
use Vidal's result [20| . 

Let us now provide the proof of the improvement to an 
exact algorithm, for the case that the Hamiltonian has a 
polynomial spectral gap. In other words, we are promised 
that the ground energy is separated from the rest of the 
eigenvalues of the Hamiltonian by a gap A > l/n c for 
some constant c. Notice that we don't assume a unique 
ground state. 

The first step of the proof would be to use Theorem Q] 
to find an MPS of constant bond dimension such that 
(f2|jy|f2) < Eq + A/3. From the discussion above, it is 
clear that this can be done in polynomial time. Next, 
we would like to project this MPS sequentially on some 
chosen eigenspaces of the Hamiltonians along the chain. 
As we are in a commuting system, this would result in a 
common eigenvector of all Hamiltonians, and therefore an 
eigenvector of H itself. If we manage to do this without 
increasing the energy above Eo + A, then by the existence 
of the gap, we are promised to have reached a ground 
state. 

To do this, we rely on the following lemma: 



Lemma 7 Let H = Hi be a commuting local Hamil- 
tonian system with ground energy Eq, and let \ip) be a 
state such that (ip\H\ip) = Eg + h. Consider one term Hi 
in H with k eigenvalues and projections Pi , . . . , Pk into 
the corresponding eigenspaces. For every j — 1, . . . , k } let 
\tpj) be the normalization of Pj\ip), and let Cj = (ip\Pj\ip). 
Then for any n > 2 there is always a j such that Cj > 
and (VilBIVj) <E + (l + i)h. 



Proof: As the {Hi} terms are commuting, it follows that 

(il>\H\tl>) = {MPiHPi\4') + (iP\P 2 HP 2 \ip) 
+ ... + (iP\P k HP k \iP) 

= C 1 (V>lTO 1 ) +C 2 (^ 2 |tf|V>2) 

+ . . . + c k (ip k \H\ip k } , 

with y^,j—i Cj = 1. Assume, by contradiction, that for 
every j, either Cj < or (iji^H^j) > E + (1 + ^)h. 
Then partition the k eigenspaces into two subsets: subset 
A in which the first condition holds, and subset B in 



which the second condition holds. Then 
E Q + h= (^|F|V>) 

A B 

>£o£c, + (i?o + (l + V)£> 

A ^ ' B 

= E + (l + -)hJ2 c i > 

B 

using Y^j Cj = 1. Since J2a c i - j£? = ^> we nave tnat 
Yb c 3 — 1 — Plugging this into the above equality 
implies h > h{l + -)(1 — \) which is a contradiction for 
n > 2. ■ 

We now apply the lemma sequentially to project the 
approximate state |f2) on the relevant local eigenspaces. 
We start with Hi 2, where we use h = A/3 in the lemma. 
The lemma promises the existence of a subspace indexed 
j (out of k possible js) which, if we project |f2) onto that 
subspace, the projection will not have too large energy. 
We denote Cj and Pj by cu and P12 respectively (We will 
shortly explain how all calculations required for finding 
the promised j can be done efficiently). We proceed to 
find C23 and P23 for the next term i?2,3, using the newly 
projected state, and so on up to H n n _i. After applying 
the n — 1 projections, using the lemma n — 1 times, we 
arrive to a state given by 

IV') = 1 — P12-P23 • • • Pn-l,n|fi> , 

y'Cl2C23 • ' ' C„_i >n 

which satisfies 

/ 1 \ n ~ * A A 
(i>\HH) <E +(l + -) j < E + C — . 

Using the assumption of the gap and the fact that is 
an eigenvector of H, it must be that \ip) is a ground state 
and (ip\H\ip) = E . 

We now argue why finding the j whose existence is 
promised by the lemma can be done efficiently. Consider 
for example the term H m ^ m+ i. To find the relevant j 
we have to compute, for the current state |^>), both the 
norms squared Cj = {tj)\Pj\%j}) as well as the expectation 
values (ipj\H\il)j) = ~{ip\PjHPj\ip), for all eigenspaces 
Pj of H mm+ i. Note first that we are handling here 
real numbers; the projections Pj on the eigenspaces of 
H m ,m+i may require infinite precision to describe exactly 
in binary (or any other) representation. We truncate the 
entries in the projections to exponentially good precision, 
using polynomially many bits, so that all the calculations 
can be performed efficiently. This introduces an exponen- 
tially small error. 

The expressions we are interested in calculating are all 
of the form 

(fi|Pl2 • • • Pm,ra-1 ■ PjOPj ■ P m ,m-1 ■ ■ ■ Pl2\ty , (22) 
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This structure has the property that -ffjj+i leaves the 



FIG. 4. An illustration of how the expression in Eq. (1221) is 
given as a tensor-network with a constant number of horizon- 
tal layers. Specifically, the figure describes the tensor-network 
of (f2|Pi 2 P 2 3 ■ ■ • H jtj+1 ■ ••P 23 -Pl2|fi) 



where O can be either a local Hamiltonian -ff^i+i or the 
identity, and the -P^i+i are projections on eigenspaces 
of the local terms. Recalling that |0) is a constant- 
bond MPS, and using the fact that the projections com- 
mute between themselves, we can write Eq. ([22]) as a 
constant depth-tensor network. This is done by parti- 
tioning the projections into two layers: in one layer the 
projections that work on the sites (1, 2), (3, 4), (5, 6), . . . 
and on the other, the projections that act on the sites 
(2, 3), (4, 5), (6, 7), — The resultant tensor-network is 
shown in Fig. |4j One dimensional tensor-networks with 
constant depth can be efficiently calculated on a classical 
computer because their bubble width is constant when 
swallowed from left to right [Til] . [22| 

Thus, all calculations (under our assumptions of poly- 
nomially many bits of precision of the Pj 's) can be per- 
formed efficiently. The resulting state is given by a ten- 
sor network of constant depth (namely the original |f2) 
on which the chosen projections are applied.) As before, 
this can be modified to a MPS of constant bond dimen- 
sion using Vidal's result [2(j, concluding the proof. 



A. A Proof for the Commuting_Hamiltonians case, 
based on Ref. 



First we describe the alternate algorithm assuming we 
have the ability to perform arithmetic operations with 
infinite precision and then discuss the consequences of 
limited precision. Ref. 0] prove certain properties about 
the ground states of 2-local commuting Hamiltonians in 
which the interaction graph is a general graph. We ex- 
press those properties for the special case of interest here 
in which the graph is a line. Let Hj be the Hilbert space 
of particle j . It is shown in Ref. Q that when the terms of 
the Hamiltonian commute, the Hilbert space of each par- 
ticle can be expressed as a direct sum, Hj — Q) a j1~L,j 3 , 

such that each H^ 3 ^ can then be expressed as a tensor 
product of three spaces 



H 



H 



L.J 



(«;) 



n 



R,3 



subspaces H^ 3 ^ <£>7^" J 1 +1 '' invariant, and moreover, when 
restricted to such a subspace, Hj.j + \ acts non-trivially 

feY 1 ( the ri s ht P art of wj a,) 

left part ofu[ a]+l) ^ 



( Q j+i) • 



only on H^j 



"Hlj+i (the right part of Hp' and the 
^ ) . Consequently, there exists a ground 



state which resides in some subspace H^ = (S)jHj\ 
for some choice of a±, . . . , a n . Moreover, within the sub- 
space H^ the state can be written as a tensor prod- 
uct of 2 particle states living in the spaces of the form 
n R?j ® n L tensored with some arbitrary single par- 
ticle states living in the H^j spaces. 

If the algorithm knows the correct choice of indices 
Qi,...,a n , it can find such a ground state efficiently, as 
follows. Note that the descriptions of both the spaces 



H^ 3 ^ and their divisions H y ^ 3> = H L ■ 



/("j) _ iy(«j) 



H 



H 



(°y) 



are derived from local properties of Hj imposed by the 
two local Hamiltonians Hj—i^ and Hjj + \. The subdi- 
vision of Hj in this way can be expressed as a solution 
to a set of quadratic homogeneous constraints. Since the 
dimension of Hj and hence the number of variables is 
constant, it can be efficiently computed. If the algorithm 
knows the Oj's, it therefore knows the description of the 

subspaces H^j ® H^ 3 ^ ® U-r ^ , and the restriction of 
the Hjj + i to those spaces; it therefore just needs to find 
a ground state of linearly many 2-particle Hamiltonians, 
which is an easy task. It is therefore enough for the al- 
gorithm to find the correct ot\, . . . , a n indices. 

We will do this using dynamic programming. The crit- 
ical point in using dynamic programming here is that the 
energy contribution of Hg^+i depends only on the choice 
of ct£ and a^+i, so the choice of ctk for k < j — 1 does not 
affect the energy of the -f/^+i terms for any I > j. Using 
this observation, the algorithm proceeds from left to right 
as follows. For the first term Hi ^, the algorithm finds 
the division into a direct sum of subspaces for particles 1 
and 2. The algorithm keeps an optimal state (choice of 
ct\) and energy for each possible a<x. 

Then, in a general step, we assume at particle i we 
have the following information for each index a j : a list of 
indices oc\, . . . , such that the ground energy of the 
Hamiltonian of particles 1 , . . . , i restricted to the sub- 
spaces Hi ® . . . ® h\°^ is minimal. To continue to 
the next particle, we first compute the division into sub- 
spaces for particle i + 1, indexed by Oj+i, and optimize 
for each subspace in turn. For each subspace, we consider 
all items in the previous list; for each item, we have a list 
of subspaces, one for each particle. We compute the min- 
imal energy for each such restriction, including now the 
Hi^ + \ term in the calculation of the energy, restricted 
according to subspaces Oj+i and on, the last choice com- 
ing from the list. We pick the tail of the subspace of the 
i + 1 particle to be the one which minimizes the terms 
up to that point. 

Notice that in each step the dynamic program com- 
pares partial energies emerging from restricting the state 
to a different sector in the Hilbert space. These ener- 
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gies can be computed efficiently with polynomially many 
bits, namely up to exponentially good precision. Thus, 
this second algorithm achieves exact results for a some- 
what larger set of Hamiltonians than our first algorithm, 
namely those for which the partial energies will not be 
confused if the computations are done with exponentially 
good precision. 

Note that even with this extremely good resolution, 
it might be the case that the ground energy is confused 
with a slightly excited energy which is, say, doubly ex- 
ponentially close. We do not know of a good condition 
which would rule out the possibility of such very close 
energies, except for some very trivial assumptions such 
as requiring that all eigenvalues arc integer numbers. For 
example, even if we require that the different entries in 
the terms in the Hamiltonian are all rationals smaller 
than 1 with denominator upper bounded by a constant, 
it is still not known how to rule out the possibility that 
two eigenvalues of the overall Hamiltonian are doubly 
exponentially close. This issue touches upon an open 
question in number theory related to sums of algebraic 
numbers - see the open problem described in Ref. [23| . 
which can be traced back to Ref. [24{ (if not earlier), and 
also Ref. 1251 and references therein. 
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Appendix: Proofs of lemmas 

Proof of Lemma f3[ - 

Let 8 = v/72b. We will occasionally use the assump- 
tion that 8 < 1 /72b. 

First we create a set R(8) of real numbers in the in- 
terval [0, 1] such that for any real number in the range 
[0,1], it is within 8 of some element in R(8). R(8) will 
have [1/25] elements. To create R(S), we add (2j + 1)8 
for each integer j in the range from through [1/25] — 2. 
Note that the largest point in R(S) so far is in the range 
[1 - 38, 1 - 8). Then we add 1 - 8 to R(8). 

Then using R(8), we create a set C(8) which is a set 
of complex scalars which form a net over all complex 
scalars with norm at most 1. Include xe l2ny , for every 
x,y G R(S). There arc [1/25] 2 < {1/S) 2 points in C{8). 
For any complex number c if norm at most 1, there is a 
number d in C{8) such that \c— c'\ < 28. 



To generate S a ,b, consider first the set Si of of all pos- 
sible a x b matrices with entries from C(8). This set 
contains |C((5)| ab matrices. In the case where a = 1 and 
we only want entries with real, non-negative coefficients, 
we use R(8) for the entries instead of C(8) and the set 
contains |i?(5)| h matrices (in fact, vectors). Then: 

1. Remove any matrix from Si which has a row whose 
norm is greater than 1 + \^b2S or less than 1 — \/b25, 
to get 52. 

2. Rcnormalize each row in every matrix in S2 to get 
S3. 

3. Remove any matrix from S3 which has any two rows 
whose inner product is more than 9\/b8. 

4. For any matrix in S3, Apply the Gram-Schmidt 
procedure to the rows to form an orthonormal set. 

We claim that the final set is the desired S a ^. 
Note that the number of matrices is 0((l/8) 2ab ) = 
0{{72b/v) 2ab ), and the running time to produce the set 
is 0(a 2 b(l/S) 2ab ) = 0{a 2 b{72b/v) 2ab ) as required. What 
remains to show is that if A is any a x b matrix whose 
rows form an ortho-normal set then we can find a matrix 
B in S a ^b which is close to it. 

Let W be an ax b matrix. Wc will denote it's i th row by 
Wi. Define the distance between two matrices d(W,W) 
to be max, ||Wj — W-\\. Let X be the matrix obtained by 
rounding every entry in A to the nearest complex number 
in C(8). Let Y be the matrix obtained after the rows of 
X are normalized and let Z be the matrix obtained after 
the rows of Y are transformed into an ortho-normal set 
via the Gram-Schmidt procedure. We need to prove that 
d(A,Z) < v, and to show that Z € S a ^-, which would 
imply together that we can choose B in the lemma to be 
equal to Z . 

We will now prove both of the above claims. For the 
second part we need to prove that X survives step 1 and 
Y survives step 3. 

X survives step 1: Since each entry in A — X has 
magnitude at most 28, we know that d(A,X) < \fb28. 
In order to bound the norm of X4, observe that 



Vb28> \\Ai-XiW > \\\Ai 



\Xi 



Since \\Ai\\ = 1, it follows that ||X;|| lies in the range 
from 1 — \/b28 to 1 + \/b2S and it will survive Step 1. Wc 
have: 



d(X,Y) < max ||^ 

i 

Vb28 



1 



1 - Vb28 



A'\ 



1 - Vb28 



< V&25(36/35) 



The latter inequality uses the assumption that 8 < 
\/72\fb. Using the triangle inequality for our distance 
d(-), we have that for any i \\Ai - Yi]\ < (4 + ^)VbS. 
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Y survives step 3: Now we need to bound the inner 
product of any two rows of Y in order to establish that 
it is not removed in Step 3: 

\{Yl\Yj)\ = \(A + (Yi A t )\A 3 + {Y j Aj))\ 
< KM^l + \(Y i -A i \Y j -A j }\ 



\{Yi - AAAj)\ + \{AAYj - Aj)\ 



< \\Yi — \\Yj — A., 



\Yi 



\A, 



\AA\\\Y-A, 



< Vb~5 



< 9VbS . 



2 

35 



+ 2(4 



2 

35 



The second inequality uses the Chauchy-Schwartz in- 
equality. The last inequality uses the fact that y/bS < 
1/72. 

Bounding the distance d(A, Z): Finally, we need 
to consider how much the matrix shifts as a result of 
the Gram-Schmidt procedure, to bound d(Y,Z). Let 
fi = 9s/b8 = %v/72\/b. Since a < b, by assumption in the 
lemma, we know that \i < 9vjl2^fa. We use this latter 
bound in the next part of the proof since we are bounding 
quantities by a function of a instead of b. Since we assume 
that v < \jyfa, we can assume that a/i < 9/72. Recall 
that the Gram-Schmidt procedure starts with Z\ = Y\. 
Then each Zi is determined by first creating an unnor- 
malized state Z^. 

i-l 

Z i = Y i -Y^(Z j \Y)Z j . 
i=i 

Then Zi is normalized to 1. We will prove the following 
two properties by induction in i, 

1- < 2 M for all 3 sucn tnat 3 > i 

2. 1 - < \\Zi\\ < 1 + 2vV- 

Z\ is not defined, but we can take it to be Z\. The two 
properties clearly hold for Z\. Now by induction 

i-l 

11^.11 =11^-^(^)^-11 



j'=i 



i-l 



3=1 



i-l 



1/2 



=1+ \ YS z i\ Y i)(nzi) 

<1 + 2-y/a// 

A similar argument can be used to show that 1 
2^fa\i — \\Zi\\- Next we establish Property 1: 



\{Y k \Zi)\ 



1 



(y k \Yi} -Y,(Zj\Yi)(Yk\Zj) 

3=1 



<- 



1 — 2 v / a/i 



i-l 
3=1 



< M(l+4a/z) 
— 1 — 2 v / a y u ~~ 



The first inequality follows from the inductive hypothesis. 
The last inequality make use of the fact that a/j < 9/72. 
Finally to bound II Yi — ZA\, we have 



\Yi-ZAW < 



i-l 



< 



1 — 2^/a/i 1 — 2\fa[i 



3 = 1 
i-l 



1/2 



J =1 



The last inequality uses again the fact that ^fa\x < 
9/72. The total distance between A and Z is at most 
5VbS + Qy/aiJL. Plug ging in [i = 9\/bS and using the fact 
that a < 6, we get an upper bound of 59&<5 < v on the 
distance of A to Z, using the definition of S. ■ 



Lemma 8 Let \A), \B) be two two-particles states that, 
and expand them in the standard basis of the first particle: 



L4) = $>K)IA-> ■ 

i 

\B) = Y^k\im) , 



such that \Ai) are normalized but not-necessarily orthog- 
onal to themselves and similarly the \Bi). Then 



Ik 



Proof: 



b\\= (j2\ai-bA <\\A-B\\. (A.l) 



&|| a = X>-&,| 2 

i 
i 

i 

= \\\A)-\B)\\ 2 . 
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